Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences
Daniel Neil, Michael Pfeiffer, Shih-Chii Liu
Recurrent Neural Networks (RNNs) have become the state-of-the-art choice for extracting patterns from temporal sequences. However, current RNN models are ill-suited to process irregularly sampled data triggered by events generated in continuous time by sensors or other neurons. Such data can occur, for example, when the input comes from novel event-driven artificial sensors that generate sparse, asynchronous streams of events, or from multiple conventional sensors with different update intervals. In this work, we introduce the Phased LSTM model, which extends the LSTM unit by adding a new time gate. This gate is controlled by a parametrized oscillation with a frequency range that produces updates of the memory cell only during a small percentage of the cycle. Even with the sparse updates imposed by the oscillation, the Phased LSTM network achieves faster convergence than regular LSTMs on tasks which require learning of long sequences. The model naturally integrates inputs from sensors of arbitrary sampling rates, thereby opening new areas of investigation for processing asynchronous sensory events that carry timing information. It also greatly improves the performance of LSTMs in standard RNN applications, and does so with an order-of-magnitude fewer computes at runtime.
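To make the time-gate idea concrete, below is a minimal NumPy sketch of the kind of periodic gate the abstract describes: each unit has a period, a phase shift, and an open ratio, the gate rises and falls linearly during the small open fraction of each cycle and otherwise applies only a small leak, and the memory cell is blended with its previous value in proportion to the gate's openness. Variable names and constants are illustrative placeholders, not the authors' reference implementation.

```python
import numpy as np

def time_gate_openness(t, tau, shift, r_on, alpha=1e-3):
    """Openness of a periodic time gate at (possibly irregular) timestamp t."""
    # Phase of each unit's oscillation at time t, in [0, 1).
    phase = ((t - shift) % tau) / tau
    rising = 2.0 * phase / r_on                 # ramp up during the first half of the open phase
    falling = 2.0 - 2.0 * phase / r_on          # ramp down during the second half
    closed = alpha * phase                      # small leak while the gate is "closed"
    return np.where(phase < 0.5 * r_on, rising,
                    np.where(phase < r_on, falling, closed))

def gated_cell_update(c_prev, c_proposed, k):
    """Blend the LSTM's proposed cell state with the previous one via the time gate."""
    return k * c_proposed + (1.0 - k) * c_prev

# Toy usage: three hidden units with different periods, sampled at irregular event times.
taus = np.array([0.5, 1.0, 2.0])                # one oscillation period per unit
event_times = [0.00, 0.13, 0.26, 0.27, 0.90]    # timestamps need not lie on a regular grid
for t in event_times:
    k = time_gate_openness(t, taus, shift=0.0, r_on=0.05)
    print(f"t={t:.2f}  gate openness per unit: {np.round(k, 3)}")
```

Because the gate is evaluated at the actual timestamp of each input, irregularly sampled or event-driven data can be fed in directly, and a unit whose gate is closed leaves its memory cell essentially untouched, which is where the reduction in runtime compute comes from.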
Reviews: Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences
LSTMs (and GRUs) are increasingly used as basic building blocks in neural network architectures, both in inherently sequential problems and in other applications, since many other problems can usefully be decomposed into sequential ones using mechanisms such as attention. Despite being devised some time ago, LSTMs have proved difficult to beat as general-purpose tools for modeling sequential structure. This paper presents an interesting idea for improving the performance of LSTMs, particularly on tasks that contain cyclical structure. It is novel and explains the model and motivations well. There are aspects of the analysis and experimental results that could be improved on, but it is a novel approach that will be of interest to the field. I have several suggestions for improvements below, but these do not significantly detract from the work, which is of a high standard.